Modified Self-organizing Maps for Line Extraction in Digitized Text Documents

Authors

  • Juan Manuel Alonso-Weber
  • Inés María Galván
  • Araceli Sanchis
Abstract

Different authors have developed modifications of the Kohonen Self-Organizing Map to solve well-known combinatorial optimization problems. In this paper, a modification of the Kohonen Map is proposed to detect the white inter-text spaces in digitized plain-text documents. The idea relies on the fact that the line extraction problem has several features that map easily onto Kohonen networks, although the original learning rule must first be adapted to the problem. A test on different digitized text images shows the ability of the approach to segment lines.
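
For reference, the classic Kohonen learning rule that the abstract says has to be adapted can be sketched as below. This is the standard, unmodified rule applied to a 1-D chain of scalar weights, not the authors' modified version, which the abstract does not describe. The function name `train_som_1d`, its parameters, and the linear decay schedules are illustrative assumptions.

```python
import math
import random


def train_som_1d(samples, n_units, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a 1-D Kohonen chain on scalar samples with the classic rule."""
    rng = random.Random(seed)
    lo, hi = min(samples), max(samples)
    # Initialise weights uniformly at random inside the data range.
    weights = [rng.uniform(lo, hi) for _ in range(n_units)]
    if radius0 is None:
        radius0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        # Linearly decay the learning rate and the neighbourhood radius.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            # Best-matching unit: the unit whose weight is closest to x.
            bmu = min(range(n_units), key=lambda i: abs(weights[i] - x))
            for i in range(n_units):
                # Gaussian neighbourhood over the index distance on the chain.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                # Classic update: pull the weight towards the sample.
                weights[i] += lr * h * (x - weights[i])
    return weights
```

Calling `train_som_1d([0.0, 0.1, 0.2, 9.8, 9.9, 10.0], n_units=4)` returns four weights that stay inside the range of the samples, because every update is a convex combination of the old weight and the sample.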

Similar resources

Corporate Decision Making with Self-Organizing Patent Maps Labeled by Technical Terms and AHP

In this paper, we propose an approach for corporate decision making with self-organizing patent maps labeled by technical terms and AHP. First, we select the patent area of interest and collect pertinent patent documents in text format. Second, we extract keywords by text mining to transform patent documents into feature vectors of the companies. Third, we input the feature matrix of technical ...

Segmentation of Digitized Mammograms Using Self-Organizing Maps in a Breast Cancer Computer Aided Diagnosis System

The objective of this work is to develop a digitized mammograms’ feature extraction approach using Kohonen’s Self-Organizing Maps (SOM). Once developed, the SOM network will be used as the first processing stage in a breast cancer computer aided diagnosis (CAD) system. Its role will be to offer segmented data as input to a second stage dedicated to the diagnosis task, which will be implemented ...

A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps

With the increasing amount of multilingual text on the Internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method that applies the growing hierarchical self-organizing map (GHSOM) model to discover knowledge from multilingual text docu...

Landforms identification using neural network-self organizing map and SRTM data

During an 11-day mission in February 2000, the Shuttle Radar Topography Mission (SRTM) collected data over 80% of the Earth's land surface, for all areas between 60 degrees N and 56 degrees S latitude. Since SRTM data became available, many studies have used them for topographic and morphometric landscape analysis. Exploiting SRTM data for recognition and extraction of topographic ...

Word-Streams for Representing Context in Word Maps

The most prominent use of Self-Organizing Maps (SOMs) in text archiving and retrieval is the WEBSOM. In WEBSOM, a map is first used to reduce the dimensionality of the huge term frequency table by training a so-called word-category map. This word-category map is then used to convert the individual documents into their respective document signatures (i.e. histograms of words) which form the basis ...

Publication date: 2003